<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta name="description" content="">
    <meta name="author" content="">
    <meta name="keyword" content="">
    <link rel="shortcut icon" href="/node-mongodb-native/img/favicon.png">

    <title>Nested Categories</title>

    <link href="/node-mongodb-native/css/bootstrap-theme.css" rel="stylesheet">
    <link href="/node-mongodb-native/assets/font-awesome/css/font-awesome.min.css" rel="stylesheet" />
    <link href="/node-mongodb-native/css/style.css" rel="stylesheet">
    <link href="/node-mongodb-native/css/style-responsive.css" rel="stylesheet" />
    <link href="/node-mongodb-native/css/monokai_sublime.css" rel="stylesheet" />

  </head>

  <body>

  <section id="container" class="">


      <header class="header black-bg">
            <div class="toggle-nav">
                <i class="fa fa-bars"></i>
                <div class="icon-reorder tooltips" data-original-title="Toggle Navigation" data-placement="bottom"></div>
            </div>


            <a href="/node-mongodb-native/" class="logo"><img src="/node-mongodb-native/img/logo-mongodb-header.png" style="height:40px;"></a>

            <div class="nav title-row" id="top_menu">
                <h1 class="nav top-menu"> Nested Categories </h1>
            </div>
      </header>



<aside>
    <div id="sidebar"  class="nav-collapse ">

        <ul class="sidebar-menu">




            <li class="sub-menu active">
            <a href="javascript:;" class="">
                <i class='fa fa-book'></i>

                <span>schema design</span>
                <span class="menu-arrow fa fa-angle-down"></span>
            </a>
                <ul class="sub open">

                    <li><a href="/node-mongodb-native/schema/chapter1/"> Introduction </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter2/"> Schema Basics </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter3/"> MongoDB Storage </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter4/"> Indexes </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter5/"> Metadata </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter6/"> Time Series </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter7/"> Queues </a> </li>

                    <li class="active"><a href="/node-mongodb-native/schema/chapter8/"> Nested Categories </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter9/"> Account Transactions </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter10/"> Shopping Cart </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter11/"> Sharding </a> </li>

                </ul>

              </li>


                <li>
                <a class="" href="https://mongodb.github.io/node-mongodb-native/2.2">
                    <i class='fa fa-file-text'></i>

                    <span>2.0 driver docs</span>
                </a>

              </li>


                <li>
                <a class="" href="https://mongodb.github.io/node-mongodb-native/2.2">
                    <i class='fa fa-file-text'></i>

                    <span>1.4 driver docs</span>
                </a>

              </li>


                <li>
                <a class="" href="https://mongodb.github.io/node-mongodb-native/2.2">
                    <i class='fa fa-file-text'></i>

                    <span>Core driver docs</span>
                </a>

              </li>

            <li> <a href="https://groups.google.com/forum/#!forum/learn-mongo-db-the-hardway" target="blank"><i class='fa fa-life-ring'></i>Issues & Help</a> </li>


        </ul>

    </div>
</aside>




      <section id="main-content">
          <section class="wrapper">













              <div class="row">
                  <div class="col-md-1">


                      <a class="navigation prev" href="/schema/chapter7">
                      <i class="fa fa-angle-left"></i>
                      </a>


                  </div>
                <div class="col-md-10">
                    <section class="panel">



                    <div class="panel-body">



<h1 id="nested-categories:32486c278e1ddf0a50ecd2904ca850af">Nested Categories</h1>

<p>The nested categories schema design pattern targets the hierarchical structures traditionally found in a product catalog on an line e-commerce website.</p>

<p><img src="/images/originals/hierarchy.png" alt="A Category Hierarchy Example" />
</p>

<h2 id="trees-using-paths:32486c278e1ddf0a50ecd2904ca850af">Trees using Paths</h2>

<p>Let&rsquo;s insert the category <strong>/electronics/embedded</strong> and a product that belongs in this category.</p>

<pre><code class="language-js">var categories = db.getSisterDB(&quot;catalog&quot;).categories;
categories.insert([{
    &quot;name&quot;: &quot;electronics&quot;
  , &quot;parent&quot;: &quot;/&quot;
  , &quot;category&quot;: &quot;/electronics&quot;
}, {
    &quot;name&quot;: &quot;embedded&quot;
  , &quot;parent&quot;: &quot;/electronics&quot;
  , &quot;category&quot;: &quot;/electronics/embedded&quot;
}, {
    &quot;name&quot;: &quot;cases&quot;
  , &quot;parent&quot;: &quot;/&quot;
  , &quot;category&quot;: &quot;/cases&quot;
}, {
    &quot;name&quot;: &quot;big&quot;
  , &quot;parent&quot;: &quot;/cases&quot;
  , &quot;category&quot;: &quot;/cases/big&quot;
}, {
    &quot;name&quot;: &quot;small&quot;
  , &quot;parent&quot;: &quot;/cases&quot;
  , &quot;category&quot;: &quot;/cases/small&quot;
}]);

var products = db.getSisterDB(&quot;catalog&quot;).products;
products.insert({
    &quot;name&quot;: &quot;Arduino&quot;
  , &quot;cost&quot;: 125
  , &quot;currency&quot;: &quot;USD&quot;
  , &quot;categories&quot;: [&quot;/electronics/embedded&quot;]
});
</code></pre>

<p>In the Trees as Paths we are using what looks like file directory paths from UNIX. Together with regular expressions we can slice the paths as we need to efficiently retrieve any level of the tree structure.</p>

<p>A couple of notes about the schema above. Notice that the product has an array of <strong>categories</strong>. This lets you easily list the same product in multiple categories (maybe by attributes such as embedded but also Arduino).</p>

<p>Lets fetch all the categories just below the top level <strong>/</strong></p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
var categories = col.find({parent: /^\/$/}).toArray();

for(var i = 0; i &lt; categories.length; i++) {
  print(categories[i].category);
}
</code></pre>

<p>Notice the regular expression <strong>/^\/$/</strong>. This translates to all documents where the field <strong>parent</strong> starts and ends with <strong>/</strong>.</p>

<p>Locate the entire tree structure below the cases level <strong>/cases</strong></p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
var categories = col.find({category: /^\/cases\//}).toArray();

for(var i = 0; i &lt; categories.length; i++) {
  printjson(categories[i]);
}
</code></pre>

<p>Notice the regular expression <strong>/^\/cases\/$/</strong>. This matches all documents where the field <strong>category</strong> starts with <strong>/cases/</strong>, thus returning all the categories in the subtree below <strong>/cases</strong></p>

<p>To locate all the products for the <strong>/electronics/embedded</strong> category</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).products;
var products = col.find({categories: /^\/electronics\/embedded$/}).toArray();

for(var i = 0; i &lt; products.length; i++) {
  printjson(products[i]);
}
</code></pre>

<p>This will match any documents where the categories array contains the string <strong>/electronics/embedded</strong>.</p>

<h3 id="indexes:32486c278e1ddf0a50ecd2904ca850af">Indexes</h3>

<p>Adding indexes ensure the lookup is as fast as possible as the database needs to spend less effort in locating the data. Below are some recommended indexes for the schema.</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
col.ensureIndex({parent:1})
col.ensureIndex({category:1})

var col = db.getSisterDB(&quot;catalog&quot;).products;
col.ensureIndex({categories:1})
</code></pre>

<h3 id="covered-indexes:32486c278e1ddf0a50ecd2904ca850af">Covered Indexes</h3>

<p>Covered indexes are indexes that contain enough information to be able to answer the query with the data stored in the index and are in general, many times faster than queries that need to search the documents. We could create a covered index for categories to allow for quick retrieval of a specific level.</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
col.ensureIndex({parent:1, name: 1})
</code></pre>

<p>The Index <strong>{parent:1, name:1}</strong> is a compound index and will contain both the <strong>parent</strong> and <strong>name</strong> field and can cover queries containing those fields.</p>

<p>Let&rsquo;s rewrite the query of all the categories just below the top level <strong>/</strong> and look at the explain plan.</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
col.find({parent: /^\/$/}, {_id:0, parent:1, name:1}).explain();
</code></pre>

<p>This should return a document result that contains a field <strong>indexOnly</strong> that is set to true indicating that the query can be answered by only using the index. However you do have to give up the <strong>_id</strong> field for this query but in many cases covered index queries can give a radical performance boost for a query where <strong>_id</strong> is not needed and a small set of fields need to be returned.</p>

<p>Knowing this we can rewrite the query for the direct sub categories of the path <strong>/</strong>.</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
var categories = col.find({parent: /^\/$/}, {_id:0, parent:1, name:1});

for(var i = 0; i &lt; categories.length; i++) {
  print(categories[i].category);
}
</code></pre>

<h3 id="pros-and-cons:32486c278e1ddf0a50ecd2904ca850af">Pros and Cons</h3>

<p><strong>Pros</strong></p>

<ul>
<li>Quick retrieval of a subtree, backed by index</li>
<li>Flexible</li>
</ul>

<p><strong>Cons</strong></p>

<ul>
<li>Expensive to retrieve the parent path for a specific node (ex:all parent categories of small)</li>
<li>Relies on regular expressions make it more complicated (wrong regexp&rsquo;s can cause collection scans)</li>
</ul>

<h2 id="trees-using-ancestors-array:32486c278e1ddf0a50ecd2904ca850af">Trees using Ancestors Array</h2>

<p>In Trees using Ancestor&rsquo;s Array each tree node contains it&rsquo;s node path allowing you to retrieve a single node and being able to retrieve all it&rsquo;s parents nodes in a single query. Below is the categories tree from the above example.</p>

<pre><code class="language-js">var categories = db.getSisterDB(&quot;catalog&quot;).categories;
categories.insert([{
  &quot;_id&quot;: &quot;root&quot;
} , {
    &quot;_id&quot;: &quot;electronics&quot;
  , &quot;tree&quot;: [&quot;root&quot;]
  , &quot;parent&quot;: &quot;root&quot;
}, {
    &quot;_id&quot;: &quot;embedded&quot;
  , &quot;tree&quot;: [&quot;root&quot;, &quot;electronics&quot;]
  , &quot;parent&quot;: &quot;electronics&quot;
}, {
    &quot;_id&quot;: &quot;cases&quot;
  , &quot;tree&quot;: [&quot;root&quot;]
  , &quot;parent&quot;: &quot;root&quot;
}, {
    &quot;_id&quot;: &quot;big&quot;
  , &quot;tree&quot;: [&quot;root&quot;, &quot;cases&quot;]
  , &quot;parent&quot;: &quot;cases&quot;
}, {
    &quot;_id&quot;: &quot;small&quot;
  , &quot;tree&quot;: [&quot;root&quot;, &quot;cases&quot;]
  , &quot;parent&quot;: &quot;cases&quot;
}, {
    &quot;_id&quot;: &quot;yellow&quot;
  , &quot;tree&quot;: [&quot;root&quot;, &quot;cases&quot;, &quot;small&quot;]
  , &quot;parent&quot;: &quot;small&quot;
}]);

var products = db.getSisterDB(&quot;catalog&quot;).products;
products.insert({
    &quot;name&quot;: &quot;Arduino&quot;
  , &quot;cost&quot;: 125
  , &quot;currency&quot;: &quot;USD&quot;
  , &quot;categories&quot;: [&quot;embedded&quot;]
});
</code></pre>

<p>Locating all the direct descendants of the <strong>cases</strong> tree node.</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
var categories = col.find({parent: &quot;cases&quot;}).toArray();

for(var i = 0; i &lt; categories.length; i++) {
  printjson(categories[i]);
}
</code></pre>

<p>Locate all the nodes that share the common parent node <strong>cases</strong> (allowing you to extract a subtree of categories)</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
var categories = col.find({tree: &quot;cases&quot;}).toArray();

for(var i = 0; i &lt; categories.length; i++) {
  printjson(categories[i]);
}
</code></pre>

<p>Locate all the parent nodes for the <strong>yellow</strong> node.</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
var nodes = col.findOne({_id: &quot;small&quot;}).tree;
var categories = col.find({_id: {$in: nodes}}).toArray();

for(var i = 0; i &lt; categories.length; i++) {
  printjson(categories[i]);
}
</code></pre>

<p>One thing to notice is that one can retrieve the entire path from the root for a specific node in two queries. This is in contrast to the Path based tree where there is no cheap way to retrieve it.</p>

<h3 id="indexes-1:32486c278e1ddf0a50ecd2904ca850af">Indexes</h3>

<p>Just as the Path based tree we can benefit from a couple of indexes to improve retrieval performance.</p>

<pre><code class="language-js">var col = db.getSisterDB(&quot;catalog&quot;).categories;
col.ensureIndex({parent:1})
col.ensureIndex({tree:1})

var col = db.getSisterDB(&quot;catalog&quot;).products;
col.ensureIndex({categories:1})
</code></pre>

<p>Notice that in this case we are not explicitly creating a <strong>_id</strong> as it&rsquo;s by default a unique index.</p>

<h3 id="pros-and-cons-1:32486c278e1ddf0a50ecd2904ca850af">Pros and Cons</h3>

<p><strong>Pros</strong></p>

<ul>
<li>Quick retrieval of all ascendants and descendants of a particular node</li>
<li>More straightforward to understand than the Path Based tree</li>
<li>Not reliant on regular expressions for matching</li>
</ul>

<p><strong>Cons</strong></p>

<ul>
<li>More expensive than the Path Based tree</li>
</ul>



                <div class="col-md-1">


                    <a class="navigation next" href="/schema/chapter9">
                        <i class="fa fa-angle-right"> </i>
                    </a>


                </div>
              </div>

          </section>
      </section>

  </section>


    <script src="/node-mongodb-native/js/jquery-2.1.1.min.js"></script>
    <script src="/node-mongodb-native/js/jquery.scrollTo.min.js"></script>
    <script src="/node-mongodb-native/js/bootstrap.min.js"></script>

    <script src="/node-mongodb-native/js/highlight.pack.js"></script>
    <script>hljs.initHighlightingOnLoad();</script>
    <script src="/node-mongodb-native/js/scripts.js"></script>
    <script async defer id="github-bjs" src="/node-mongodb-native/js/buttons.js"></script>
    <script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-41953057-1', 'auto');
  ga('send', 'pageview');

</script>
  </body>
</html>
